Item generation was undertaken in several stages. First, the authors drafted two scales (harm reduction principles and harm reduction strategies) and piloted them with harm reduction workers during local trainings and an evaluation of harm reduction supply distribution. Following the initial pilot project, we revised and added items to address gaps that had been noted. Items were generated with regard to both the SAMHSA Harm Reduction Framework and the National Harm Reduction Coalition’s Harm Reduction Principles (https://harmreduction.org/about-us/principles-of-harm-reduction/). Once the initial item pool was generated, we consulted with Dr. David Frank, a harm reduction expert and researcher. Based on his consultation, items were further modified to be accessible to a general population.
We used Wolf et al.’s (2023) Response Process Evaluation (RPE) method to ensure that a non-expert, non-harm reduction audience would understand the item content. Over X rounds of data collection, we used Prolific to collect qualitative responses to each item. As a group, we evaluated each set of responses to determine whether respondents understood the intended meaning of each item. Based on respondent answers and suggestions, we identified which items were generally understood and which should be modified or removed. The final item pool contained 45 items related to harm reduction strategies and principles. We phrased items as simply and directly as possible, occasionally splitting complex topics (experiences of discrimination, other topic) across multiple items when necessary. For Study 1, items were answered on a 6-point ordinal scale from 1 (Strongly Disagree) to 6 (Strongly Agree). Following inspection of item responses, Studies 2 and 3 used 5-point ordinal scales from 1 (Strongly Disagree) to 5 (Strongly Agree). Participants answered items in a random order and completed a brief demographic survey.
People who inject drugs should be able to do so in a way that prevents them from causing further harm to their health.
People should have access to tools for safer sex (like condoms, STD tests).
People who use drugs should have access to naloxone/NARCAN.
The general public should have access to naloxone/NARCAN.
Police officers should have access to naloxone/NARCAN.
People who use drugs should have access to tools to test what's in their drugs.
People who use drugs should have access to safe injection supplies (sterile needles and syringes).
People who use drugs should have access to safe inhalation supplies (glass stems and pipes).
People who actively use drugs should have access to therapy/counseling.
People who return to using drugs after a period of abstinence should be allowed to continue in treatment.
Sobriety should be required for treatment.
Medications used to treat addiction (buprenorphine, naltrexone, or methadone) are an appropriate treatment option for people who use drugs.
Possession of "drug paraphernalia", like syringes and pipes, should be legal.
People who seek medical assistance for overdoses should be protected from drug charges, arrests, and prosecutions.
Sobriety should not be a requirement to access public housing.
Possession of all drugs should be decriminalized (possession would not lead to legal repercussions).
It should be legal for adults to purchase drugs from a dispensary/shop.
People who use drugs should have access to supervised places where they can consume drugs safely.
People who use drugs should have access to a legal, non-contaminated drug supply.
People use drugs to escape.
People should be able to use drugs safely.
People who use drugs should be treated with respect.
Poverty affects the health of people who use drugs.
Racism affects the health of people who use drugs.
Gender-based discrimination affects the health of people who use drugs.
Some ways of using drugs are safer than others.
People who use drugs deserve to live good lives.
Reducing drug use is a reasonable goal for people who use drugs.
Some people who use drugs cannot be expected to quit immediately.
People who use drugs should be involved in creating the programs and policies that serve them.
People in recovery from drug use should be involved in creating the programs and policies that serve them.
Relapse may be a part of the recovery process.
It is possible to live a healthy life without stopping drug use.
People who use drugs should be forced into treatment.
Using drugs is immoral.
Drug use has benefits.
Harm reduction complements traditional addiction prevention, treatment, and recovery services.
People who use drugs benefit society.
Reducing the negative consequences of drug use encourages more people to use drugs.
People will use more drugs if it is safer.
People who use drugs will naturally end up homeless.
Drugs make the world worse.
Drug use will always be part of society.
Chaotic drug use is a rational response to experiences like trauma, homelessness, hunger, and poverty.
People who use drugs should be able to use medications used to treat addiction (buprenorphine, naltrexone, or methadone) for any length of time.
Study 1
Study 1 was preregistered; the preregistration can be accessed at https://osf.io/yt7sm. In this study, we conducted exploratory factor analysis using an oblimin rotation based on a polychoric correlation matrix, with maximum likelihood estimation for factor extraction. We used the 40-30-20 rule to determine factor-loading cutoffs: retained items should load on their primary factor above .40, load on alternative factors below .30, and show a difference of at least .20 between primary and alternative factor loadings. Finally, scale reliability was assessed using Cronbach’s alpha, the greatest lower bound, and average split-half reliability.
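The 40-30-20 retention rule described above amounts to a simple filter over a factor-loading matrix. The sketch below is illustrative Python (the analyses themselves were run in R), and the item loadings are made up:

```python
import numpy as np

def passes_40_30_20(loadings):
    """Return True for items whose largest absolute loading is > .40,
    whose other loadings are all < .30, and whose gap between the
    primary and next-largest loading is > .20."""
    keep = []
    for row in np.abs(np.asarray(loadings, dtype=float)):
        ordered = np.sort(row)[::-1]            # largest loading first
        primary, secondary = ordered[0], ordered[1]
        keep.append(bool(primary > 0.40
                         and secondary < 0.30
                         and (primary - secondary) > 0.20))
    return keep

# Hypothetical loadings for three items on two factors
L = [[0.65, 0.10],   # clean primary loading -> retain
     [0.45, 0.35],   # cross-loading too high -> drop
     [0.35, 0.05]]   # primary loading too low -> drop
print(passes_40_30_20(L))  # [True, False, False]
```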
Participants
Code
# Libraries
library(psych)
library(correlation)
library(EFA.dimensions)
library(GPArotation)
library(tidyverse) # dplyr, tidyr, and ggplot2 (%>%, select, pivot_longer) are used throughout

# Import Data
df <- read.csv("data/clean/20231011_hr-scale-exploratory-data.csv")

# Build Demographic Data Set
demos <- df %>%
  select(age, gid1, race, ed, sud_hx)

# Build Data Set of Items
hr <- df %>% # N = 301
  select(starts_with("q"), -Q_RecaptchaScore)
Through Prolific, we recruited 301 participants for Study 1. These participants were 50% men (n = 149) and 70% White (n = 210); 41% had completed a bachelor’s degree (n = 122), 88% reported no history of a substance use disorder (SUD) diagnosis, and the average age was 36 years. Table 2 provides additional detail about this sample.
Code
library(gtsummary)

demos %>%
  tbl_summary(
    missing = "no",
    statistic = list(age = "{mean} ({sd})"),
    digits = list(age ~ c(2, 2)),
    sort = list(everything() ~ "frequency"),
    label = list(
      age ~ "Age",
      gid1 ~ "Gender Identity",
      race ~ "Race/Ethnicity",
      ed ~ "Highest Level of Education",
      sud_hx ~ "History of Substance Use Disorder Diagnosis"
    )
  ) %>%
  as_gt() %>%
  tab_header(title = "Table 2: Study 1 Participants")
Table 2: Study 1 Participants
Characteristic
N = 301¹
Age
35.92 (11.28)
Gender Identity
Man
149 (50%)
Woman
143 (48%)
Non-Binary or Gender Fluid
7 (2.3%)
Prefer not to say
1 (0.3%)
Race/Ethnicity
White
210 (70%)
Black or African American
30 (10.0%)
Asian or Asian American
28 (9.3%)
Multiracial
13 (4.3%)
Latine or Hispanic
12 (4.0%)
Prefer not to say
4 (1.3%)
Indigenous, Aboriginal, or First Nations
2 (0.7%)
Prefer to self-describe
2 (0.7%)
Highest Level of Education
Bachelor’s Degree
122 (41%)
High School
97 (32%)
Master’s Degree
39 (13%)
Associate’s Degree
28 (9.3%)
Doctorate Degree (PhD, PsyD, etc.)
4 (1.3%)
Professional Degree (J.D., M.D., etc.)
4 (1.3%)
Less than High School
3 (1.0%)
Other option not represented here
2 (0.7%)
Prefer not to say
1 (0.3%)
History of Substance Use Disorder Diagnosis
No
265 (88%)
Yes
28 (9.3%)
Prefer not to say
5 (1.7%)
I don’t know
3 (1.0%)
¹ Mean (SD); n (%)
Initial Data Inspection
Code
bart <- cortest.bartlett(hr)
kmo <- KMO(hr)
Using a polychoric correlation matrix, we computed Bartlett’s test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure to ensure that the data were appropriate for factor analysis. Our data appeared adequate for factor analysis, with KMO = 0.94 and Bartlett’s sphericity \(\chi^2\)(990) = 8275.10, p < .001. Inspection of the correlation matrix showed many inter-item correlations > .40, indicating that a factor solution should be possible.
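For reference, Bartlett's statistic can be computed directly from a correlation matrix and the sample size. This is a minimal Python sketch (the study used psych::cortest.bartlett in R), with an illustrative 2 × 2 correlation matrix rather than the study's data:

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and df for Bartlett's test that a p x p
    correlation matrix R is an identity matrix, given sample size n."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)            # log-determinant of R
    chi2 = -((n - 1) - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi2, df

# Two moderately correlated variables yield a clearly significant statistic
R = np.array([[1.0, 0.6],
              [0.6, 1.0]])
chi2, df = bartlett_sphericity(R, n=301)
print(round(chi2, 1), df)  # 133.2 1
```

An identity matrix (no correlation at all) gives a statistic of exactly zero, the null the test evaluates.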
Local Dependence Between Items
Code
# Saved because LOCALDEP takes forever to run
# ld <- LOCALDEP(hr)
# saveRDS(ld, "study1-local-dependence.rds")
ld <- readRDS("study1-local-dependence.rds")
Initial inspection revealed some excessively high inter-item correlations. First, there were 14 inter-item correlations > .70 and 2 partial correlations > .60. Using the EFA.dimensions package, possible local dependence was inspected via the Q3, X2, and G2 statistics. Possible locally dependent items were inspected pair by pair. Those that were highly similar were removed, while those that appeared statistically locally dependent but differed in content were retained. Table 3 provides additional detail about possible local dependence. Based on both quantitative and qualitative inspection of items, the following items were removed due to their similarity with others: items 4 and 5 (local dependence with item 3), item 8 (local dependence with item 7), item 45 (local dependence with item 12), item 22 (local dependence with item 27), items 24 and 25 (local dependence with item 23), item 32 (local dependence with item 29), item 31 (local dependence with item 30), and item 40 (local dependence with item 39). While there were other possible removals, these items represented the most overlap in both statistical and content terms.
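The first screening step above, finding item pairs whose correlations exceed a threshold, can be sketched in a few lines. This is a hypothetical Python illustration (the actual screening used Q3, X2, and G2 statistics from EFA.dimensions in R), with made-up item names and correlations:

```python
import numpy as np
from itertools import combinations

def flag_redundant_pairs(R, names, threshold=0.70):
    """List item pairs whose absolute correlation exceeds the threshold --
    candidates for local-dependence inspection."""
    flagged = []
    for i, j in combinations(range(len(names)), 2):
        if abs(R[i, j]) > threshold:
            flagged.append((names[i], names[j], float(R[i, j])))
    return flagged

# Hypothetical correlations among three items
R = np.array([[1.00, 0.82, 0.35],
              [0.82, 1.00, 0.40],
              [0.35, 0.40, 1.00]])
print(flag_redundant_pairs(R, ["q4", "q5", "q23"]))  # [('q4', 'q5', 0.82)]
```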
Code
ld$localdep_stats %>%
  # Remove additional statistic
  select(-JSI) %>%
  # Filter out unconcerning items
  filter(abs(Q3) > .34 & (p_X2 < .05 | p_G2 < .05)) %>%
  gt(groupname_col = "Item_A", rowname_col = "Item_B") %>%
  tab_spanner(label = "Correlations", columns = c(3, 4)) %>%
  tab_spanner(label = "X2 Statistics", columns = c(6, 7)) %>%
  tab_spanner(label = "G2 Statistics", columns = c(8, 9)) %>%
  tab_header(title = "Table 3: Inspection of Local Dependence between Similar Items")
Table 3: Inspection of Local Dependence between Similar Items
After removing items which showed excessive local dependence, Bartlett’s sphericity and the Kaiser-Meyer-Olkin measure were checked again, with KMO = 0.95 and Bartlett’s sphericity \(\chi^2\)(595) = 5557.32, p < .001.
Exploratory Factor Analysis
Parallel Analysis
Code
fa.para <- fa.parallel(hr1, fm = "ml", cor = "poly")
Parallel analysis suggests that the number of factors = 4 and the number of components = 2
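Parallel analysis (produced above by psych::fa.parallel) compares the observed eigenvalues against eigenvalues of random data of the same dimensions, and retains factors only while the observed values are larger. The sketch below is a simplified Python illustration on simulated two-factor data; the study's version used polychoric correlations and a different resampling scheme:

```python
import numpy as np

def parallel_analysis(X, n_sims=50, seed=0):
    """Retain factors whose observed eigenvalues exceed the mean
    eigenvalues of random normal data with the same dimensions."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    ref = np.zeros(p)
    for _ in range(n_sims):
        Z = rng.standard_normal((n, p))
        ref += np.sort(np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False)))[::-1]
    ref /= n_sims                                # mean random eigenvalues
    k = 0
    while k < p and obs[k] > ref[k]:             # count leading factors
        k += 1
    return k

# Simulated data: two latent factors, five indicators each, light noise
rng = np.random.default_rng(1)
F = rng.standard_normal((1000, 2))                  # latent factor scores
W = np.kron(np.eye(2), np.ones((1, 5)))             # loading pattern (2 x 10)
X = F @ W + 0.3 * rng.standard_normal((1000, 10))   # observed items
print(parallel_analysis(X))  # 2
```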
An initial factor analysis revealed that a four-factor solution provided poor fit for the data, with factor 1 explaining 41% of the total variance (eigenvalue = 14.57), factor 2 explaining 8% (eigenvalue = 2.8), factor 3 explaining 3% (eigenvalue = 1.05), and factor 4 explaining 2.5% (eigenvalue = .90). This model had the following fit indices: \(\chi^2\)(461) = 1007.56, p < .001; CFI = 0.911; TLI = 0.885; RMSEA = 0.066. In addition to poor fit, numerous cross-loadings violated the required .20 difference between primary and alternative loadings. Table 4 provides full factor loadings.
Based on initial factor loadings, items were iteratively removed and the structure was reduced from 4 factors to 3. In total, 13 items were removed (see Table 4), resulting in a 22-item, 3-factor structure with adequate factor loadings and cross-loadings. This model had the following fit indices: \(\chi^2\)(102) = 139.19, p = 0.009; CFI = 0.985; TLI = 0.978; RMSEA = 0.036. In this solution, factor 1 explained 47% of the total variance (eigenvalue = 10.45), factor 2 explained 12% (eigenvalue = 2.74), and factor 3 explained 5% (eigenvalue = 1.09). This model showed acceptable fit and was chosen as the final solution for this study.
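Because each standardized item contributes one unit of variance, the total variance in the 22-item solution is 22, so the reported proportions can be sanity-checked by dividing each eigenvalue by the number of retained items (assuming that is how the proportions were computed):

```python
# Eigenvalues from the final Study 1 solution; total variance = 22 items
eigenvalues = [10.45, 2.74, 1.09]
shares = [round(e / 22, 2) for e in eigenvalues]
print(shares)  # [0.47, 0.12, 0.05]  -- i.e., 47%, 12%, and 5%
```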
Scale Inspection
The first factor shows good reliability, with alpha, split-half, and greatest lower bound of reliability greater than .93. The second factor has acceptable split-half and internal reliability, but its greatest lower bound is very low. Finally, the third factor has split-half reliability, internal consistency, and greatest lower bound between .7 and .8, which indicates acceptable, if modest, reliability. Based on these statistics, the first factor provides the best measurement of the construct of interest, while factors 2 and 3 present some concern.
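Of the reliability statistics named above, Cronbach's alpha is the simplest to illustrate: it compares the sum of item variances to the variance of respondents' total scores. This is a self-contained Python sketch with made-up responses (the study itself used the psych package in R):

```python
import numpy as np

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Made-up responses: 5 respondents x 4 items on a 1-5 scale
scores = [[5, 4, 5, 4],
          [2, 2, 3, 2],
          [4, 4, 4, 5],
          [1, 2, 1, 2],
          [3, 3, 4, 3]]
print(round(cronbach_alpha(scores), 2))  # 0.95
```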
Once reliability for the final factor solution had been inspected, item responses were investigated. For factor 1, the lowest response options (1, “Strongly Disagree” and 2, “Disagree”) were underused for many items. The exceptions were items which loaded negatively (i.e., reverse coded), like the final item in factor 1, for which the highest response option (6, “Strongly Agree”) was used only 7% of the time.
Code
item <- read.csv("references/docs/items.csv")

## Scale 1
items.1 <- data.frame(alpha1$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X6)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly Disagree",
      name == "X2" ~ "Disagree",
      name == "X3" ~ "Somewhat Disagree",
      name == "X4" ~ "Somewhat Agree",
      name == "X5" ~ "Agree",
      name == "X6" ~ "Strongly Agree"
    ),
    levels = c("Strongly Agree", "Agree", "Somewhat Agree",
               "Somewhat Disagree", "Disagree", "Strongly Disagree")
  )) %>%
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .5), size = 8, color = "white") +
  ggokabeito::scale_fill_okabe_ito(
    breaks = c("Strongly Disagree", "Disagree", "Somewhat Disagree",
               "Somewhat Agree", "Agree", "Strongly Agree")
  ) +
  coord_flip() +
  ggforce::facet_col(facets = vars(question), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Factor 1") +
  theme(axis.title = element_blank(),
        strip.background = element_blank(),
        legend.position = "top",
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.background = element_blank(),
        strip.text = element_text(size = 15, vjust = -.2),
        panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(),
        strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 30),
        legend.text = element_text(size = 10))

## Scale 2
items.2 <- data.frame(alpha2$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X6)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly Disagree",
      name == "X2" ~ "Disagree",
      name == "X3" ~ "Somewhat Disagree",
      name == "X4" ~ "Somewhat Agree",
      name == "X5" ~ "Agree",
      name == "X6" ~ "Strongly Agree"
    ),
    levels = c("Strongly Agree", "Agree", "Somewhat Agree",
               "Somewhat Disagree", "Disagree", "Strongly Disagree")
  )) %>%
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .5), size = 8, color = "white") +
  ggokabeito::scale_fill_okabe_ito(
    breaks = c("Strongly Disagree", "Disagree", "Somewhat Disagree",
               "Somewhat Agree", "Agree", "Strongly Agree")
  ) +
  coord_flip() +
  ggforce::facet_col(facets = vars(question), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Factor 2") +
  theme(axis.title = element_blank(),
        strip.background = element_blank(),
        legend.position = "top",
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.background = element_blank(),
        strip.text = element_text(size = 10),
        panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(),
        strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 20))

## Scale 3
items.3 <- data.frame(alpha3$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X6)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly Disagree",
      name == "X2" ~ "Disagree",
      name == "X3" ~ "Somewhat Disagree",
      name == "X4" ~ "Somewhat Agree",
      name == "X5" ~ "Agree",
      name == "X6" ~ "Strongly Agree"
    ),
    levels = c("Strongly Agree", "Agree", "Somewhat Agree",
               "Somewhat Disagree", "Disagree", "Strongly Disagree")
  )) %>%
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .52), size = 3, color = "white") +
  ggokabeito::scale_fill_okabe_ito(
    breaks = c("Strongly Disagree", "Disagree", "Somewhat Disagree",
               "Somewhat Agree", "Agree", "Strongly Agree")
  ) +
  coord_flip() +
  ggforce::facet_col(facets = vars(question), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Factor 3") +
  theme(axis.title = element_blank(),
        strip.background = element_blank(),
        legend.position = "top",
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.background = element_blank(),
        strip.text = element_text(size = 10),
        panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(),
        strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 20))

items.1
Figure 1: Study 1, Factor 1 Response Distribution
The same was true for both Factors 2 and 3, which showed that extreme response options were underutilized across most items (See Figures 2 & 3).
Code
items.2
items.3
Figure 2: Study 1, Factor 2 Response Distribution
Figure 3: Study 1, Factor 3 Response Distribution
Because the extreme response options were underused, we decided to reduce the response scale from 6 options to 5 for future data collection efforts. The final factor solution was deemed acceptable for confirmatory studies.
Study 2
Study 2 was preregistered; the preregistration can be accessed at https://osf.io/t5e2a. In this study, we conducted confirmatory factor analysis on new data using the factor structure from Study 1. The confirmatory factor analysis was fit using weighted least squares estimation, treating categories as ordinal rather than numeric, which was a deviation from the preregistration. We used Hu & Bentler’s (1999) cutoffs of .95 for TLI and CFI and .06 for RMSEA.
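These cutoffs amount to a simple decision rule over three fit indices. A small sketch, using fit values reported elsewhere in this document purely as examples:

```python
def meets_cutoffs(cfi, tli, rmsea):
    """Hu & Bentler-style rule: CFI and TLI at least .95, RMSEA at most .06."""
    return cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.06

print(meets_cutoffs(0.985, 0.978, 0.036))  # Study 1 final EFA solution: True
print(meets_cutoffs(0.97, 0.97, 0.10))     # Study 2 CFA: False (RMSEA too high)
```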
Participants
Through Prolific, we recruited 402 participants for Study 2. These participants were 49% women (n = 196) and 71% White (n = 282); a plurality had completed a bachelor’s degree (37%, n = 150), and 90% (n = 362) reported no history of SUD diagnosis. Table 6 provides detail about the participants.
Code
df <- read.csv("data/clean/20231020_hr-scale-confirmatory-data.csv")

demos <- df %>%
  select(age, gid1, race, ed, sud_hx)

demos %>%
  tbl_summary(
    missing = "no",
    statistic = list(age = "{mean} ({sd})"),
    digits = list(age ~ c(2, 2)),
    sort = list(everything() ~ "frequency"),
    label = list(
      age ~ "Age",
      gid1 ~ "Gender Identity",
      race ~ "Race/Ethnicity",
      ed ~ "Highest Level of Education",
      sud_hx ~ "History of Substance Use Disorder Diagnosis"
    )
  ) %>%
  as_gt() %>%
  tab_header(title = "Table 6: Study 2 Participants")
Table 6: Study 2 Participants
Characteristic
N = 409¹
Age
36.45 (12.35)
Gender Identity
Woman
196 (49%)
Man
191 (48%)
Non-Binary, Agender, or Other
13 (3.3%)
Race/Ethnicity
White
282 (71%)
Asian or Asian American
41 (10%)
Black or African American
33 (8.3%)
Latine or Hispanic
18 (4.5%)
Multiracial
17 (4.3%)
Other or Prefer not to say
8 (2.0%)
Highest Level of Education
Bachelor’s Degree
150 (37%)
High School
122 (30%)
Master’s Degree
50 (12%)
Associate’s Degree
44 (11%)
Other option not represented here
11 (2.7%)
Professional Degree (J.D., M.D., etc.)
11 (2.7%)
Doctorate Degree (PhD, PsyD, etc.)
5 (1.2%)
Less than High School
4 (1.0%)
Prefer not to say
4 (1.0%)
History of Substance Use Disorder Diagnosis
No
362 (90%)
Yes
27 (6.7%)
Prefer not to say
8 (2.0%)
I don’t know
5 (1.2%)
¹ Mean (SD); n (%)
Confirmatory Factor Analysis
The model from Study 1 was fit using confirmatory factor analysis with a weighted least squares estimator and ordinal indicators. This model showed mixed fit on the selected fit statistics: \(\chi^2\)(206) = 1033.11, p < .001; CFI = .97; TLI = .97; SRMR = .27; RMSEA = .10. Based on CFI and TLI, the model fit well; based on SRMR and RMSEA, it fit poorly. Figure 4 provides factor loadings and covariances for the model.
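The reported RMSEA can be recovered from the chi-square, degrees of freedom, and sample size, assuming the standard formula sqrt(max(\(\chi^2\) - df, 0) / (df (N - 1))) and the N = 409 shown in the participant table:

```python
from math import sqrt

def rmsea(chi2, df, n):
    """Point estimate of RMSEA from the model chi-square, df, and N."""
    return sqrt(max(chi2 - df, 0) / (df * (n - 1)))

# Values reported for the Study 2 CFA; N = 409 from the participant table
print(round(rmsea(1033.11, 206, 409), 2))  # 0.1
```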
Reliability is also an issue with this model: the first factor shows continued adequate reliability, but factors 2 and 3 continue to show poorer reliability. Additionally, reliability would only improve by dropping items, which would shorten factor 2 to 3 items, if not fewer.
Based on this poor model fit, the same data set was used to conduct a second exploratory factor analysis using a maximum likelihood estimator and an oblimin rotation. This exploratory analysis was not preregistered. The items which showed initial local dependence were reinspected. In this study, items 8, 24, 22, and 32 showed acceptable levels of local dependence and were retained. After removing items which showed excessive local dependence, Bartlett’s sphericity and the Kaiser-Meyer-Olkin measure were checked again, with KMO = 0.95 and Bartlett’s sphericity \(\chi^2\)(703) = 7757.05, p < .001.
An initial factor analysis revealed that a four-factor solution provided poor fit for the data, with factor 1 explaining 41% of the total variance (eigenvalue = 14.57), factor 2 explaining 8% (eigenvalue = 2.8), factor 3 explaining 3% (eigenvalue = 1.05), and factor 4 explaining 2.5% (eigenvalue = .90). This model had the following fit indices: \(\chi^2\)(557) = 1736.37, p < .001; CFI = 0.873; TLI = 0.839; RMSEA = 0.076. In addition to poor fit, numerous cross-loadings violated the required .20 difference between primary and alternative loadings. Table 6 provides full factor loadings.
After removing items with no primary loading or with close cross-loadings, a three-factor solution provided the best fit. At this point, factor 3 consisted of 6 items which explained only 22% of the total variance and had 4 factor loadings less than .55. Based on these quantitative concerns and inspection of the items’ content, they were removed. These were items 2, 9, 28, 29, and 32. The content of these items focused primarily on treatment aspects of harm reduction, which were less directly relevant to the constructs of interest.
The resulting model was a two-factor model consisting of 15 items. Factor 1 explained 59% of the explained variance, and factor 2 explained 41%. This model had the following fit indices: \(\chi^2\)(89) = 305.23, p < .001; CFI = 0.952; TLI = 0.936; RMSEA = 0.079, which indicated acceptable fit. Due to its low loading in this study, item “q39” is a candidate for removal following the next round of data collection.
Factor 1
People who use drugs should have access to safe injection supplies (sterile needles and syringes).
People who inject drugs should be able to do so in a way that prevents them from causing further harm to their health.
People who use drugs should have access to tools to test what's in their drugs.
People who use drugs should have access to supervised places where they can consume drugs safely.
People who use drugs should have access to a legal, non-contaminated drug supply.
People should be able to use drugs safely.
Racism affects the health of people who use drugs.
People who seek medical assistance for overdoses should be protected from drug charges, arrests, and prosecutions.
Possession of "drug paraphernalia", like syringes and pipes, should be legal.
Factor 2
It is possible to live a healthy life without stopping drug use.
Drugs make the world worse. *
Drug use has benefits.
Using drugs is immoral. *
People who use drugs benefit society.
People who use drugs should be forced into treatment. *
People who use drugs will naturally end up homeless. *
* indicates reverse coded items
Scale Inspection
The first factor shows good reliability, with alpha, split-half, and greatest lower bound of reliability greater than .93. The second factor has acceptable split-half and internal reliability, but its greatest lower bound is very low, which continues to be a concern. Overall, between the number of items, EFA fit statistics, and reliability, this model provides better fit and output compared to the initial model.
Both factors show a distribution similar to the initial study. Some of the more extreme response options are underused, but all response options are used across items.
Code
data.frame(alpha1$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X5)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly disagree",
      name == "X2" ~ "Somewhat disagree",
      name == "X3" ~ "Neither agree nor disagree",
      name == "X4" ~ "Somewhat agree",
      name == "X5" ~ "Strongly agree"
    ),
    levels = c("Strongly disagree", "Somewhat disagree",
               "Neither agree nor disagree", "Somewhat agree", "Strongly agree")
  )) %>%
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .5), size = 3) +
  coord_flip() +
  ggforce::facet_col(facets = vars(question), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Factor 1") +
  theme(axis.title = element_blank(),
        strip.background = element_blank(),
        legend.position = "top",
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.background = element_blank(),
        strip.text = element_text(size = 10),
        panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(),
        strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 20))

data.frame(alpha2$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X5)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly disagree",
      name == "X2" ~ "Somewhat disagree",
      name == "X3" ~ "Neither agree nor disagree",
      name == "X4" ~ "Somewhat agree",
      name == "X5" ~ "Strongly agree"
    ),
    levels = c("Strongly disagree", "Somewhat disagree",
               "Neither agree nor disagree", "Somewhat agree", "Strongly agree")
  )) %>%
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .5), size = 3) +
  coord_flip() +
  ggforce::facet_col(facets = vars(question), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Factor 2") +
  theme(axis.title = element_blank(),
        strip.background = element_blank(),
        legend.position = "top",
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.background = element_blank(),
        strip.text = element_text(size = 10),
        panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(),
        strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 20))
Figure 5: Study 2 EFA, Factor 1 Response Distribution
Figure 6: Study 2 EFA, Factor 2 Response Distribution
Study 3
Study 3 was preregistered; the preregistration can be accessed at https://osf.io/q4uzv. In this study, we conducted confirmatory factor analysis on new data using the factor structure from Study 2. The confirmatory factor analysis was fit using weighted least squares estimation, treating categories as ordinal rather than numeric, which was a deviation from the preregistration. We used Hu & Bentler’s (1999) cutoffs of .95 for TLI and CFI and .06 for RMSEA.
Participants
Through Prolific, we recruited 402 participants for Study 3. These participants were 52% men (n = 205) and 70% White (n = 277); a plurality had completed a bachelor’s degree (41%, n = 161), and 88% reported no history of SUD diagnosis. Table 11 provides detail about the participants.
Code
df <- read.csv("data/clean/20231025_hr-scale-confirmatory-data2.csv")

demos <- df %>%
  select(age, gid1, race, ed, sud_hx)

demos %>%
  tbl_summary(
    missing = "no",
    statistic = list(age = "{mean} ({sd})"),
    digits = list(age ~ c(2, 2)),
    sort = list(everything() ~ "frequency"),
    label = list(
      age ~ "Age",
      gid1 ~ "Gender Identity",
      race ~ "Race/Ethnicity",
      ed ~ "Highest Level of Education",
      sud_hx ~ "History of Substance Use Disorder Diagnosis"
    )
  ) %>%
  as_gt() %>%
  tab_header(title = "Table 11: Study 3 Participants")
Table 11: Study 3 Participants
Characteristic
N = 393¹
Age
37.03 (11.59)
Gender Identity
Man
205 (52%)
Woman
174 (44%)
Non-Binary, Agender, or Other
14 (3.6%)
Race/Ethnicity
White
277 (70%)
Asian or Asian American
41 (10%)
Black or African American
35 (8.9%)
Latine or Hispanic
28 (7.1%)
Multiracial
12 (3.1%)
Highest Level of Education
Bachelor’s Degree
161 (41%)
High School
129 (33%)
Associate’s Degree
46 (12%)
Master’s Degree
34 (8.7%)
Professional Degree (J.D., M.D., etc.)
9 (2.3%)
Other option not represented here
5 (1.3%)
Doctorate Degree (PhD, PsyD, etc.)
4 (1.0%)
Less than High School
3 (0.8%)
History of Substance Use Disorder Diagnosis
No
346 (88%)
Yes
40 (10%)
I don’t know
5 (1.3%)
Prefer not to say
2 (0.5%)
¹ Mean (SD); n (%)
Confirmatory Factor Analysis
The model developed in Study 2 was fit to Study 3’s data using confirmatory factor analysis with weighted least squares estimation and ordinal indicators. The model showed acceptable fit on most indicators: \(\chi^2\)(103) = 314.68, p < .001; CFI = .98; TLI = .97; RMSEA = .07; SRMR = .19. Figure 7 provides factor loadings and covariances for the model.
The first factor shows good reliability, with alpha, split-half, and greatest lower bound of reliability greater than .93. The second factor has acceptable split-half and internal reliability, but its greatest lower bound is very low, which continues to be a concern. Overall, between the number of items, fit statistics, and reliability, this model provides better fit and output compared to the initial model.
Table 11 provides detail about the mean, median, and standard deviation for both subscales. The Strategies subscale may show some ceiling effect, as scores cluster above the scale midpoint, while the Principles subscale shows a roughly normal but slightly skewed distribution. The two subscales show little to no relationship (r = -0.07, 95% CI -0.16, 0.03).
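A confidence interval for a correlation of this size can be approximated with the Fisher z transformation. This is a generic Python sketch; the exact routine behind the reported interval is not shown, so the bounds may differ slightly in the second decimal:

```python
from math import atanh, tanh, sqrt

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via the Fisher z transform."""
    z = atanh(r)                 # transform r to an approximately normal scale
    se = 1 / sqrt(n - 3)         # standard error on the z scale
    return tanh(z - z_crit * se), tanh(z + z_crit * se)

lo, hi = fisher_ci(-0.07, 393)
print(round(lo, 2), round(hi, 2))  # -0.17 0.03
```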
Code
gtExtras::gt_plt_summary(
  meanscores[60:61],
  title = "Table 11: Mean Score for Both Factors"
)
Table 11: Mean Score for Both Factors
393 rows x 2 cols
Column       Missing   Mean   Median   SD
Strategies   0.0%      3.7    3.8      1.0
Principles   0.0%      2.7    2.7      0.4
Item Response Distribution
Both factors show a distribution similar to the initial study. Some of the more extreme response options are underused, but all response options are used across items.
Code
items1 <- data.frame(alpha1$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(items %>% select(var_label, item), by = c("item" = "var_label")) %>%
  pivot_longer(c(X1:X5)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly disagree",
      name == "X2" ~ "Somewhat disagree",
      name == "X3" ~ "Neither agree nor disagree",
      name == "X4" ~ "Somewhat agree",
      name == "X5" ~ "Strongly agree"
    ),
    levels = c("Strongly agree", "Somewhat agree", "Neither agree nor disagree",
               "Somewhat disagree", "Strongly disagree")
  )) %>%
  ggplot() +
  aes(x = item.y, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .5), size = 3) +
  ggokabeito::scale_fill_okabe_ito(
    breaks = c("Strongly disagree", "Somewhat disagree",
               "Neither agree nor disagree", "Somewhat agree", "Strongly agree")
  ) +
  coord_flip() +
  ggforce::facet_col(facets = vars(item.y), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Harm Reduction Strategies") +
  theme(axis.title = element_blank(),
        strip.background = element_blank(),
        legend.position = "top",
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.background = element_blank(),
        strip.text = element_text(size = 16),
        panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(),
        strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 20))

items2 <- data.frame(alpha2$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(items %>% select(var_label, item), by = c("item" = "var_label")) %>%
  pivot_longer(c(X1:X5)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly disagree",
      name == "X2" ~ "Somewhat disagree",
      name == "X3" ~ "Neither agree nor disagree",
      name == "X4" ~ "Somewhat agree",
      name == "X5" ~ "Strongly agree"
    ),
    levels = c("Strongly agree", "Somewhat agree", "Neither agree nor disagree",
               "Somewhat disagree", "Strongly disagree")
  )) %>%
  ggplot() +
  aes(x = item.y, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .5), size = 3) +
  ggokabeito::scale_fill_okabe_ito(
    breaks = c("Strongly disagree", "Somewhat disagree",
               "Neither agree nor disagree", "Somewhat agree", "Strongly agree")
  ) +
  coord_flip() +
  ggforce::facet_col(facets = vars(item.y), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Harm Reduction Principles") +
  theme(axis.title = element_blank(),
        strip.background = element_blank(),
        legend.position = "top",
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.background = element_blank(),
        strip.text = element_text(size = 16),
        panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(),
        strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 20))
Finally, Figure 10 shows mean score distributions for both scales in the current sample. As noted above, the Strategies subscale may show some ceiling effect, while the Principles subscale does not. While these two subscales need additional validity evidence, this 16-item factor structure demonstrates acceptable reliability and structural validity.
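The mean subscale scores summarized here can be computed with a short base-R sketch. This is illustrative only: the column names, the 1–5 scale, and the reverse-coded set below are hypothetical placeholders, not the actual item names from the codebook.

```r
# Sketch: reverse-code flagged items on a 1-5 scale, then take row means.
score_subscale <- function(data, items, reversed = character(0), max_pt = 5) {
  if (length(reversed)) data[reversed] <- (max_pt + 1) - data[reversed]
  rowMeans(data[items], na.rm = TRUE)
}

# Hypothetical responses to a three-item subscale; "i3" is reverse coded
resp <- data.frame(i1 = c(5, 4), i2 = c(4, 4), i3 = c(1, 2))
score_subscale(resp, items = c("i1", "i2", "i3"), reversed = "i3")
# row means of (5, 4, 5) and (4, 4, 4) after reversing i3 (1 -> 5, 2 -> 4)
```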
---
title: "Harm Reduction Development and Validation"
author: "Zach Budesa"
format: 
  html:
    code-fold: true
    code-tools: true
    code-overflow: wrap
    toc: true
execute: 
  warning: false
  message: false
---

# Item Generation

Item generation was undertaken in several stages. First, the authors drafted items for two scales (harm reduction principles and harm reduction strategies) and piloted them with harm reduction workers in local training and harm reduction supply distribution evaluation. Following the initial pilot project, we revised and added items to address gaps that had been noted. The items were generated with regard to both the SAMHSA Harm Reduction Framework and the National Harm Reduction Coalition's Harm Reduction Principles (<https://harmreduction.org/about-us/principles-of-harm-reduction/>). Once the initial item pool was generated, we consulted with Dr. David Frank, a harm reduction expert and researcher. Based on his consultation, items were further modified to be accessible to a general population.

We used Wolf et al.'s (2023) Response Process Evaluation (RPE) method to ensure that a non-expert, non-harm reduction audience would understand the item content. Over X rounds of data collection, we used Prolific to collect qualitative responses to each item. As a group, we evaluated each set of responses to determine whether respondents understood the intended meaning of each item. Based on respondent answers and suggestions, we identified which items were generally understood and which should be modified or removed. The final item pool contained 45 items related to harm reduction strategies and principles. We phrased items as simply and directly as possible, occasionally splitting complex topics (experiences of discrimination, other topic) across multiple items when necessary.
For study 1, items were answered on a 6-point ordinal scale from 1 (Strongly Disagree) to 6 (Strongly Agree). Following inspection of item responses, studies 2 and 3 used 5-point ordinal scales from 1 (Strongly Disagree) to 5 (Strongly Agree). Participants answered items in a random order and completed a brief demographic survey.

## Items

```{r}
library(gt)
library(tidyverse)

items <- readr::read_csv("references/codebooks/20231011_hr-scale-codebook.csv")

items <- items %>%
  filter(grepl("q", var_label)) %>%
  mutate(item = str_remove_all(item, pattern = '\r\n')) %>%
  select(-1)

items %>%
  select(-var_label) %>%
  gt() %>%
  tab_header(title = "Table 1: Initial Item Set") %>%
  cols_label(item = "Item")
```

# Study 1

Study 1 was preregistered; the preregistration can be accessed at <https://osf.io/yt7sm>. In this study, we conducted exploratory factor analysis with an oblimin rotation, based on a polychoric correlation matrix, using maximum likelihood estimation for factor extraction. We used the 40-30-20 rule to determine factor loading cutoffs: retained items should load on their primary factor above 0.4, load onto alternative factors below 0.3, and show a difference of at least 0.2 between primary and alternative factor loadings. Finally, scale reliability was assessed using Cronbach's alpha, the greatest lower bound, and average split-half reliability.

## Participants

```{r}
# Libraries
library(psych)
library(correlation)
library(EFA.dimensions)
library(GPArotation)

# Import Data
df <- read.csv("data/clean/20231011_hr-scale-exploratory-data.csv")

# Build Demographic Data Set
demos <- df %>%
  select(age, gid1, race, ed, sud_hx)

# Build Data Set of Items
hr <- df %>% # N = 301
  select(starts_with("q"), -Q_RecaptchaScore)
```

Through Prolific, we recruited 301 participants for Study 1. These participants were 50% men (n = 149) and 70% White (n = 210); 41% had completed a bachelor's degree (n = 122), 88% reported no history of a substance use disorder (SUD) diagnosis, and the average age was 36.
Table 2 provides additional detail about this sample.

```{r}
library(gtsummary)

demos %>%
  tbl_summary(
    missing = "no",
    statistic = list(age = "{mean} ({sd})"),
    digits = list(age ~ c(2, 2)),
    sort = list(everything() ~ "frequency"),
    label = list(
      age ~ "Age",
      gid1 ~ "Gender Identity",
      race ~ "Race/Ethnicity",
      ed ~ "Highest Level of Education",
      sud_hx ~ "History of Substance Use Disorder Diagnosis")
  ) %>%
  as_gt() %>%
  tab_header(title = "Table 2: Study 1 Participants")
```

## Initial Data Inspection

```{r}
bart <- cortest.bartlett(hr)
kmo <- KMO(hr)
```

Using a polychoric correlation matrix, we analyzed Bartlett's sphericity and the Kaiser-Meyer-Olkin test to ensure that the data were appropriate for factor analysis. Our data appeared adequate for factor analysis, with Kaiser-Meyer-Olkin = `r kmo$MSA`, and Bartlett's sphericity $\chi^2$(`r bart$df`) = `r bart$chisq`, p = `r bart$p.value`. Inspection of the correlation matrix showed many inter-item correlations \> .40, indicating that a factor solution should be possible.

### Local Dependence Between Items

```{r}
# Saved because LOCALDEP takes forever to run
# ld <- LOCALDEP(hr)
# saveRDS(ld, "study1-local-dependence.rds")
ld <- readRDS("study1-local-dependence.rds")
```

Initial inspection did reveal excessively high inter-item correlations that caused concern. First, there were 14 inter-item correlations \> .70 and 2 partial correlations \> .60. Using the EFA.dimensions package, possible local dependence was inspected via the Q3, X2, and G2 statistics. Possible locally dependent items were inspected pair by pair. Those that were highly similar were removed, while those that appeared statistically locally dependent but differed in content were retained. Table 3 provides additional detail about possible local dependence.
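The intuition behind a Q3-style screen can be shown with a minimal base-R sketch. This is illustrative only — the actual screening used `EFA.dimensions::LOCALDEP`, and the loading vector below is a made-up example: after removing a single factor's contribution from the correlation matrix, item pairs whose residual correlation is still large are flagged as possibly locally dependent.

```r
# Sketch: flag possibly locally dependent item pairs, given a correlation
# matrix R and a (hypothetical) unidimensional loading vector lambda.
flag_local_dependence <- function(R, lambda, cutoff = .34) {
  resid <- R - lambda %*% t(lambda)        # remove the common-factor part
  uniq  <- 1 - lambda^2                    # item uniquenesses
  q3    <- resid / sqrt(uniq %*% t(uniq))  # rescale to a Q3-like correlation
  flags <- which(abs(q3) > cutoff & upper.tri(q3), arr.ind = TRUE)
  data.frame(item_a = flags[, 1], item_b = flags[, 2], q3 = q3[flags])
}

# Four items loading .7 on one factor; items 1 and 2 share extra correlation
lambda <- c(.7, .7, .7, .7)
R <- lambda %*% t(lambda); diag(R) <- 1
R[1, 2] <- R[2, 1] <- .49 + .2             # residual correlation of .2
flag_local_dependence(R, lambda)           # flags only the (1, 2) pair
```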
Based on both quantitative and qualitative inspection of items, the following items were removed due to their similarity with others: items 4 and 5 (local dependence with item 3), item 8 (local dependence with item 7), item 45 (local dependence with item 12), item 22 (local dependence with item 27), items 24 and 25 (local dependence with item 23), item 32 (local dependence with item 29), item 31 (local dependence with item 30), and item 40 (local dependence with item 39). While other removals were possible, these items showed the most overlap in both statistical and content terms.

```{r}
ld$localdep_stats %>%
  # Remove additional statistics
  select(-JSI) %>%
  # Filter out unconcerning items
  filter(abs(Q3) > .34 & (p_X2 < .05 | p_G2 < .05)) %>%
  gt(groupname_col = "Item_A", rowname_col = "Item_B") %>%
  tab_spanner(label = "Correlations", columns = c(3, 4)) %>%
  tab_spanner(label = "X2 Statistics", columns = c(6, 7)) %>%
  tab_spanner(label = "G2 Statistics", columns = c(8, 9)) %>%
  tab_header(title = "Table 3: Inspection of Local Dependence between Similar Items")

hr1 <- hr %>%
  select(-c(q4, q5, q8, q22, q24, q25, q32, q31, q40, q45))

bart <- cortest.bartlett(hr1)
kmo <- KMO(hr1)
```

After removing items which showed excessive local dependence, Bartlett's sphericity and Kaiser-Meyer-Olkin were checked again, with Kaiser-Meyer-Olkin = `r kmo$MSA`, and Bartlett's sphericity $\chi^2$(`r bart$df`) = `r bart$chisq`, p = `r bart$p.value`.

## Exploratory Factor Analysis

### Parallel Analysis

```{r}
fa.para <- fa.parallel(hr1, fm = "ml", cor = "poly")
```

### Model 1

```{r}
#| output: false
# Initial Factor Analysis
efa <- EFA.dimensions::EFA(hr1,
  extraction = "ml", Nfactors = 4,
  corkind = "polychoric",
  rotation = "oblimin",
  verbose = FALSE)

initial <- efa$pattern %>%
  as_tibble(rownames = "q") %>%
  left_join(items, by = c("q" = "var_label")) %>%
  select(q, item, 2:5) %>%
  gt() %>%
  tab_style(
    locations = list(
      # Factor 1
      cells_body(columns = `Factor 1`, rows = abs(`Factor 1`) < .30),
      # Factor 2
      cells_body(columns = `Factor 2`, rows = abs(`Factor 2`) <
.30),
      # Factor 3
      cells_body(columns = `Factor 3`, rows = abs(`Factor 3`) < .30),
      # Factor 4
      cells_body(columns = `Factor 4`, rows = abs(`Factor 4`) < .30)),
    style = list(cell_text(color = 'gray'))) %>%
  fmt_number(columns = starts_with("Factor"), decimals = 3)
```

An initial factor analysis revealed that a four-factor solution provided poor fit for the data, with factor 1 explaining 41% of the total variance (eigenvalue = 14.57), factor 2 explaining 8% (eigenvalue = 2.8), factor 3 explaining 3% (eigenvalue = 1.05), and factor 4 explaining 2.5% (eigenvalue = .90). This model had the following model fit indices: $\chi^2$(`r efa$dfMODEL`) = `r round(efa$chisqMODEL, 2)`, p = `r efa$pvalue`; CFI = `r round(efa$fit_coefs$CFI, 3)`; TLI = `r round(efa$fit_coefs$TLI, 3)`; RMSEA = `r round(efa$fit_coefs$RMSEA, 3)`. In addition to poor fit, there were numerous cross-loadings which exceeded the .2 difference limit. Table 4 provides full factor loadings.

```{r}
#| output: false
# New Matrix
hr2 <- hr1 %>%
  select(-c(q17, q12, q43, q15, q35, q37, q26, q34, q44, q20, q11, q28, q14, q27, q30, q3, q13))

efa.final <- EFA.dimensions::EFA(hr2,
  extraction = "ml", Nfactors = 3, Ncases = 301,
  rotation = "oblimin",
  verbose = FALSE)

final <- efa.final$pattern %>%
  as_tibble(rownames = "q") %>%
  left_join(items, by = c("q" = "var_label")) %>%
  select(q, item, 2:5) %>%
  gt() %>%
  tab_style(
    locations = list(
      # Factor 1
      cells_body(columns = `Factor 1`, rows = abs(`Factor 1`) < .30),
      # Factor 2
      cells_body(columns = `Factor 2`, rows = abs(`Factor 2`) < .30),
      # Factor 3
      cells_body(columns = `Factor 3`, rows = abs(`Factor 3`) < .30)),
    style = list(cell_text(color = 'gray'))) %>%
  fmt_number(columns = starts_with("Factor"), decimals = 3)

left_join(
  initial$`_data`,
  final$`_data`,
  by = c("q" = "q")) %>%
  arrange(factor(q, levels = c(
    # Factor 1
    "q1", "q18", "q7", "q19", "q21", "q6", "q16", "q39",
    # Factor 2
    "q42", "q36", "q38", "q33", "q41",
    # Factor 3
    "q9", "q2", "q23", "q10", "q29"))) %>%
  janitor::clean_names() %>%
  select(-q, -item_y) %>%
  gt() %>%
  tab_style(
    locations = list(
      # Factor 1
      cells_body(columns = factor_1_x, rows = abs(factor_1_x) < .30),
      # Factor 2
      cells_body(columns = factor_2_x, rows = abs(factor_2_x) < .30),
      # Factor 3
      cells_body(columns = factor_3_x, rows = abs(factor_3_x) < .30),
      # Factor 4
      cells_body(columns = factor_4, rows = abs(factor_4) < .30),
      # Factor 1 - Final
      cells_body(columns = factor_1_y, rows = abs(factor_1_y) < .30),
      # Factor 2 - Final
      cells_body(columns = factor_2_y, rows = abs(factor_2_y) < .30),
      # Factor 3 - Final
      cells_body(columns = factor_3_y, rows = abs(factor_3_y) < .30)),
    style = list(cell_text(color = 'gray'))) %>%
  fmt_number(columns = starts_with("Factor"), decimals = 3) %>%
  fmt_missing() %>%
  tab_header(title = "Table 4: Initial and Final Factor Structure") %>%
  tab_spanner(label = "Initial 4 Factor Solution", columns = 2:5) %>%
  tab_spanner(label = "Final 3 Factor Solution", columns = 6:8) %>%
  cols_label(
    item_x = "Item",
    factor_1_x = "Factor 1",
    factor_2_x = "Factor 2",
    factor_3_x = "Factor 3",
    factor_4 = "Factor 4",
    factor_1_y = "Factor 1",
    factor_2_y = "Factor 2",
    factor_3_y = "Factor 3")
```

### Model 2

Based on initial factor loadings, items were iteratively removed and the structure was reduced from 4 factors to 3. In total, 17 items were removed (see Table 4), which resulted in an 18-item, 3-factor structure with adequate factor loadings and cross-loadings. This model had the following model fit indices: $\chi^2$(`r efa.final$dfMODEL`) = `r round(efa.final$chisqMODEL, 2)`, p = `r round(efa.final$pvalue, 3)`; CFI = `r round(efa.final$fit_coefs$CFI, 3)`; TLI = `r round(efa.final$fit_coefs$TLI, 3)`; RMSEA = `r round(efa.final$fit_coefs$RMSEA, 3)`. In this solution, factor 1 explained 47% of the total variance (eigenvalue = 10.45), factor 2 explained 12% (eigenvalue = 2.74), and factor 3 explained 5% (eigenvalue = 1.09).
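The 40-30-20 screen used during this iterative removal can be written as a small helper. This is a base-R sketch of the rule only (the function name is ours, and the actual screening also weighed item content, not just loadings):

```r
# Sketch: apply the 40-30-20 rule to one row of a factor pattern matrix.
# loads is the vector of one item's loadings across all factors.
keep_item <- function(loads, primary = .40, alternative = .30, gap = .20) {
  a <- abs(loads)
  p <- max(a)                 # primary loading
  others <- a[-which.max(a)]  # loadings on alternative factors
  p >= primary && all(others < alternative) && (p - max(others)) >= gap
}

keep_item(c(.62, .12, .05))  # clean primary loading: keep (TRUE)
keep_item(c(.45, .35, .10))  # cross-loading above .30: drop (FALSE)
keep_item(c(.45, .28, .02))  # primary-alternative gap below .20: drop (FALSE)
```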
This model shows acceptable fit and was chosen as the final solution for this study.

## Scale Inspection

::: columns
::: {.column width="35%"}
The first factor shows good reliability, with alpha, split-half, and greatest lower bound of reliability greater than .93. The second factor has acceptable split-half and internal reliability, but its greatest lower bound is very low. Finally, the third factor has split-half reliability, internal consistency, and greatest lower bound between .7 and .8, which indicates acceptable, if low, reliability. Based on these statistics, the first factor provides the best measurement of the construct of interest, while factors 2 and 3 present some concern.
:::

::: {.column width="10%"}
<!-- Empty -->
:::

::: {.column width="55%"}
```{r}
scale1 <- df %>% select(q1, q18, q7, q19, q21, q6, q16, q39)
scale2 <- df %>% select(q42, q36, q38, q33, q41)
scale3 <- df %>% select(q9, q2, q23, q10, q29)

splithalf1 <- splitHalf(scale1); splithalf2 <- splitHalf(scale2); splithalf3 <- splitHalf(scale3)
alpha1 <- psych::alpha(scale1, check.keys = TRUE); alpha2 <- psych::alpha(scale2, check.keys = TRUE); alpha3 <- psych::alpha(scale3, check.keys = TRUE)
glb1 <- glb.algebraic(cor(scale1, use = "pairwise.complete.obs")); glb2 <- glb.algebraic(cor(scale2, use = "pairwise.complete.obs")); glb3 <- glb.algebraic(cor(scale3, use = "pairwise.complete.obs"))

data.frame(
  Factor = c("Factor 1", "Factor 1", "Factor 1", "Factor 2", "Factor 2", "Factor 2", "Factor 3", "Factor 3", "Factor 3"),
  `Reliability Statistic` = c("Average Split Half", "Alpha", "Greatest Lower Bound",
                              "Average Split Half", "Alpha", "Greatest Lower Bound",
                              "Average Split Half", "Alpha", "Greatest Lower Bound"),
  Statistic = c(round(splithalf1$meanr, 2), round(alpha1$total$raw_alpha, 2), round(glb1$glb, 2),
                round(splithalf2$meanr, 2), round(alpha2$total$raw_alpha, 2), round(glb2$glb, 2),
                round(splithalf3$meanr, 2), round(alpha3$total$raw_alpha, 2), round(glb3$glb, 2))) %>%
  gt::gt(groupname_col = "Factor") %>%
  gt::tab_header(title
= "Table 5: Reliability Statistics For All Scales")
```
:::
:::

Once reliability for the final factor solution had been inspected, item responses were investigated. For factor 1, the distribution of responses showed that the lowest response options (1, "Strongly Disagree" and 2, "Disagree") were underused for many items, except those which loaded negatively (i.e., reverse coded): for the final item in factor 1, the highest response option (6, "Strongly Agree") was used only 7% of the time.

```{r}
#| fig-width: 12
#| fig-height: 12
#| column: screen
#| out-width: 100%
#| fig-cap: "Figure 1: Study 1, Factor 1 Response Distribution"
item <- read.csv("references/docs/items.csv")

items.1 <- data.frame(alpha1$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X6)) %>%
  mutate(Response = factor(case_when(
    name == "X1" ~ "Strongly Disagree",
    name == "X2" ~ "Disagree",
    name == "X3" ~ "Somewhat Disagree",
    name == "X4" ~ "Somewhat Agree",
    name == "X5" ~ "Agree",
    name == "X6" ~ "Strongly Agree"
  ), levels = c("Strongly Agree", "Agree", "Somewhat Agree", "Somewhat Disagree", "Disagree", "Strongly Disagree"))) %>%
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .5), size = 8, color = "white") +
  ggokabeito::scale_fill_okabe_ito(breaks = c("Strongly Disagree", "Disagree", "Somewhat Disagree", "Somewhat Agree", "Agree", "Strongly Agree")) +
  coord_flip() +
  ggforce::facet_col(facets = vars(question), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Factor 1") +
  theme(axis.title = element_blank(), strip.background = element_blank(),
        legend.position = "top", axis.ticks = element_blank(),
        axis.text = element_blank(), panel.background = element_blank(),
        strip.text = element_text(size = 15, vjust = -.2),
        panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(), strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 30),
        legend.text = element_text(size = 10))

## Scale 2
items.2 <- data.frame(alpha2$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X6)) %>%
  mutate(Response = factor(case_when(
    name == "X1" ~ "Strongly Disagree",
    name == "X2" ~ "Disagree",
    name == "X3" ~ "Somewhat Disagree",
    name == "X4" ~ "Somewhat Agree",
    name == "X5" ~ "Agree",
    name == "X6" ~ "Strongly Agree"
  ), levels = c("Strongly Agree", "Agree", "Somewhat Agree", "Somewhat Disagree", "Disagree", "Strongly Disagree"))) %>%
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .5), size = 8, color = "white") +
  ggokabeito::scale_fill_okabe_ito(breaks = c("Strongly Disagree", "Disagree", "Somewhat Disagree", "Somewhat Agree", "Agree", "Strongly Agree")) +
  coord_flip() +
  ggforce::facet_col(facets = vars(question), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Factor 2") +
  theme(axis.title = element_blank(), strip.background = element_blank(),
        legend.position = "top", axis.ticks = element_blank(),
        axis.text = element_blank(), panel.background = element_blank(),
        strip.text = element_text(size = 10), panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(), strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 20))

## Scale 3
items.3 <- data.frame(alpha3$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X6)) %>%
  mutate(Response = factor(case_when(
    name == "X1" ~ "Strongly Disagree",
    name == "X2" ~ "Disagree",
    name == "X3" ~ "Somewhat Disagree",
    name == "X4" ~ "Somewhat Agree",
    name == "X5" ~ "Agree",
    name == "X6" ~ "Strongly Agree"
  ), levels = c("Strongly Agree", "Agree", "Somewhat Agree", "Somewhat Disagree", "Disagree", "Strongly Disagree"))) %>%
  ggplot() +
  aes(x = question,
y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(aes(label = paste0(round(value * 100, 0), "%")),
            position = position_stack(vjust = .52), size = 3, color = "white") +
  ggokabeito::scale_fill_okabe_ito(breaks = c("Strongly Disagree", "Disagree", "Somewhat Disagree", "Somewhat Agree", "Agree", "Strongly Agree")) +
  coord_flip() +
  ggforce::facet_col(facets = vars(question), scales = "free_y", space = "free",
                     labeller = label_wrap_gen(width = 100)) +
  labs(title = "Factor 3") +
  theme(axis.title = element_blank(), strip.background = element_blank(),
        legend.position = "top", axis.ticks = element_blank(),
        axis.text = element_blank(), panel.background = element_blank(),
        strip.text = element_text(size = 10), panel.spacing = unit(-.5, "lines"),
        legend.title = element_blank(), strip.clip = "off",
        plot.title = element_text(hjust = 0.5, size = 20))

items.1
```

The same was true for Factors 2 and 3, which showed that extreme response options were underutilized across most items (see Figures 2 & 3).

```{r}
#| column: screen
#| out-width: 80%
#| layout-ncol: 2
#| fig-cap: 
#|   - "Figure 2: Study 1, Factor 2 Response Distribution"
#|   - "Figure 3: Study 1, Factor 3 Response Distribution"
items.2
items.3
```

As a result of this response option utilization, we reduced the response scale from 6 options to 5 for future data collection efforts. The final factor solution was deemed acceptable for confirmatory studies.

# Study 2

Study 2 was preregistered; the preregistration can be accessed at <https://osf.io/t5e2a>. In this study, we conducted confirmatory factor analysis on new data, using the factor structure from Study 1. The confirmatory factor analysis was fit using weighted least squares estimation, treating categories as ordinal rather than numeric, which was a deviation from the preregistration. We used Hu & Bentler's (1999) cutoffs of .95 for TLI and CFI and .06 for RMSEA.

## Participants

Through Prolific, we recruited 402 participants for Study 2.
These participants were 49% men (n = 196) and 71% White (n = 282); the most common education level was a completed bachelor's degree (37%, n = 150), and 90% (n = 362) reported no history of SUD diagnosis. Table 6 provides detail about the participants.

```{r}
df <- read.csv("data/clean/20231020_hr-scale-confirmatory-data.csv")

demos <- df %>%
  select(age, gid1, race, ed, sud_hx)

demos %>%
  tbl_summary(
    missing = "no",
    statistic = list(age = "{mean} ({sd})"),
    digits = list(age ~ c(2, 2)),
    sort = list(everything() ~ "frequency"),
    label = list(
      age ~ "Age",
      gid1 ~ "Gender Identity",
      race ~ "Race/Ethnicity",
      ed ~ "Highest Level of Education",
      sud_hx ~ "History of Substance Use Disorder Diagnosis")
  ) %>%
  as_gt() %>%
  tab_header(title = "Table 6: Study 2 Participants")
```

## Confirmatory Factor Analysis

The model from Study 1 was fit using confirmatory factor analysis with a weighted least squares estimator and ordinal indicators. This model showed mixed fit on the selected fit statistics: $\chi^2$(206) = 1033.11, p < 0.001; CFI = .97; TLI = .97; SRMR = .27; RMSEA = .10. Based on CFI and TLI, the model had good fit; based on SRMR and RMSEA, the model showed poor fit.
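The cutoff screen applied to these indices can be sketched in base R. The helper name is ours, and the SRMR cutoff of .08 is the conventional Hu & Bentler value, included here as an assumption since the preregistration specified only CFI, TLI, and RMSEA:

```r
# Sketch: check fit indices against the preregistered cutoffs
# (CFI/TLI >= .95, RMSEA <= .06) plus the conventional SRMR <= .08.
check_fit <- function(fits) {
  c(cfi   = unname(fits["cfi"])   >= .95,
    tli   = unname(fits["tli"])   >= .95,
    rmsea = unname(fits["rmsea"]) <= .06,
    srmr  = unname(fits["srmr"])  <= .08)
}

# The Study 2 CFA values reported above: CFI and TLI pass,
# while RMSEA and SRMR fail.
check_fit(c(cfi = .97, tli = .97, srmr = .27, rmsea = .10))
```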
Figure 4 provides factor loadings and covariances for the model.

```{r}
#| column: margin
library(lavaan)
library(semPlot)
library(semptools)

hr <- df %>%
  select(starts_with("q"), -Q_RecaptchaScore)

# Initial Model Fit
model <- '
F1 =~ q1 + q18 + q7 + q19 + q21 + q6 + q16 + q39
F2 =~ q42 + q36 + q38 + q33 + q41
F3 =~ q9 + q2 + q23 + q10 + q29

F1 ~~ F2
F1 ~~ F3
'

fit <- cfa(
  model,
  data = df,
  ordered = TRUE,
  estimator = "WLS",
  std.lv = TRUE)

fitmeasures(fit,
  c("cfi", "tli", "srmr", "rmsea", "chisq", "df", "pvalue"),
  ci = TRUE) %>%
  as_tibble(rownames = "Item") %>%
  gt() %>%
  tab_header(title = "Table 7: CFA Fit Statistics") %>%
  fmt_number(columns = 2, decimals = 2)
```

```{r}
#| column: page
#| fig-cap: "Figure 4: Study 2 CFA"
plot(set_cfa_layout(
  semPaths(fit,
    whatLabels = "est",
    sizeMan = 3,
    node.width = 1,
    thresholds = FALSE,
    edge.label.cex = .75,
    style = "ram",
    mar = c(5, 1, 5, 1),
    intercepts = FALSE,
    DoNotPlot = TRUE),
  fcov_curve = 1.75,
  loading_position = .9))
```

## Scale Reliability

Reliability remains an issue with this model: the first factor continues to show adequate reliability, but factors 2 and 3 continue to show poorer reliability.
Additionally, reliability would be improved only by dropping items, which would shorten factor 2 to 3 items, if not fewer.

```{r}
scale1 <- df %>% select(q1, q18, q7, q19, q21, q6, q16, q39)
scale2 <- df %>% select(q42, q36, q38, q33, q41)
scale3 <- df %>% select(q9, q2, q23, q10, q29)

splithalf1 <- splitHalf(scale1); splithalf2 <- splitHalf(scale2); splithalf3 <- splitHalf(scale3)
alpha1 <- psych::alpha(scale1, check.keys = TRUE); alpha2 <- psych::alpha(scale2, check.keys = TRUE); alpha3 <- psych::alpha(scale3, check.keys = TRUE)
glb1 <- glb.algebraic(cor(scale1, use = "pairwise.complete.obs")); glb2 <- glb.algebraic(cor(scale2, use = "pairwise.complete.obs")); glb3 <- glb.algebraic(cor(scale3, use = "pairwise.complete.obs"))

data.frame(
  Factor = c("Factor 1", "Factor 1", "Factor 1", "Factor 2", "Factor 2", "Factor 2", "Factor 3", "Factor 3", "Factor 3"),
  `Reliability Statistic` = c("Average Split Half", "Alpha", "Greatest Lower Bound",
                              "Average Split Half", "Alpha", "Greatest Lower Bound",
                              "Average Split Half", "Alpha", "Greatest Lower Bound"),
  Statistic = c(round(splithalf1$meanr, 2), round(alpha1$total$raw_alpha, 2), round(glb1$glb, 2),
                round(splithalf2$meanr, 2), round(alpha2$total$raw_alpha, 2), round(glb2$glb, 2),
                round(splithalf3$meanr, 2), round(alpha3$total$raw_alpha, 2), round(glb3$glb, 2))) %>%
  gt::gt(groupname_col = "Factor") %>%
  gt::tab_header(title = "Table 5: Reliability Statistics For Both Scales")
```

## Exploratory Factor Analysis

```{r}
hr <- df %>%
  select(starts_with("q"), -Q_RecaptchaScore)

hr1 <- hr %>%
  select(-c(q4:q5, q25, q31, q40, q45, q8))

bart <- cortest.bartlett(hr1)
kmo <- KMO(hr1)
```

Based on poor model fit, this same data set was used to conduct a second exploratory factor analysis using a maximum likelihood estimator and an oblimin rotation. This exploratory analysis was not preregistered. The items which showed initial local dependence were reinspected. In this study, items 8, 24, 22, and 32 showed acceptable levels of local dependence and were retained.
After removing items which showed excessive local dependence, Bartlett's sphericity and Kaiser-Meyer-Olkin were checked again, with Kaiser-Meyer-Olkin = `r kmo$MSA`, and Bartlett's sphericity $\chi^2$(`r bart$df`) = `r bart$chisq`, p = `r bart$p.value`.

```{r}
#| output: false
# Initial Factor Analysis
efa <- EFA.dimensions::EFA(hr1,
  extraction = "ml", Nfactors = 4,
  corkind = "polychoric",
  rotation = "oblimin",
  verbose = FALSE)

initial <- efa$pattern %>%
  as_tibble(rownames = "q") %>%
  left_join(items, by = c("q" = "var_label")) %>%
  select(q, item, 2:5) %>%
  gt() %>%
  tab_style(
    locations = list(
      # Factor 1
      cells_body(columns = `Factor 1`, rows = abs(`Factor 1`) < .30),
      # Factor 2
      cells_body(columns = `Factor 2`, rows = abs(`Factor 2`) < .30),
      # Factor 3
      cells_body(columns = `Factor 3`, rows = abs(`Factor 3`) < .30),
      # Factor 4
      cells_body(columns = `Factor 4`, rows = abs(`Factor 4`) < .30)),
    style = list(cell_text(color = 'gray'))) %>%
  fmt_number(columns = starts_with("Factor"), decimals = 3)
```

### Initial EFA Model

An initial factor analysis revealed that a four-factor solution provided poor fit for the data, with factor 1 explaining 41% of the total variance (eigenvalue = 14.57), factor 2 explaining 8% (eigenvalue = 2.8), factor 3 explaining 3% (eigenvalue = 1.05), and factor 4 explaining 2.5% (eigenvalue = .90). This model had the following model fit indices: $\chi^2$(`r efa$dfMODEL`) = `r round(efa$chisqMODEL, 2)`, p = `r efa$pvalue`; CFI = `r round(efa$fit_coefs$CFI, 3)`; TLI = `r round(efa$fit_coefs$TLI, 3)`; RMSEA = `r round(efa$fit_coefs$RMSEA, 3)`. In addition to poor fit, there were numerous cross-loadings which exceeded the .2 difference limit. Table 8 provides full factor loadings.

After removing items with no primary loading or close cross-loadings, a three-factor solution provided the best fit. At this point, factor 3 consisted of 6 items which explained only 22% of the total variance and had 4 factor loadings less than .55.
Based on these quantitative concerns and inspection of the qualitative content of the items, they were removed. These were items 2, 9, 28, 29, and 32. The content of these items focused primarily on treatment aspects of harm reduction, which were less directly relevant to the constructs of interest.

```{r}
#| output: false
hr2 <- df %>%
  select(
    # Factor 1
    q7, q1, q6, q18, q19, q21, q24, q14, q13,
    # Factor 2
    q33, q42, q36, q35, q38, q34, q41)

efa.final <- EFA.dimensions::EFA(hr2,
  extraction = "ml", Nfactors = 2,
  corkind = "polychoric",
  rotation = "oblimin",
  verbose = FALSE)

final <- efa.final$pattern %>%
  as_tibble(rownames = "q") %>%
  left_join(items, by = c("q" = "var_label")) %>%
  select(q, item, 2:3) %>%
  gt() %>%
  tab_style(
    locations = list(
      # Factor 1
      cells_body(columns = `Factor 1`, rows = abs(`Factor 1`) < .30),
      # Factor 2
      cells_body(columns = `Factor 2`, rows = abs(`Factor 2`) < .30)),
    style = list(cell_text(color = 'gray'))) %>%
  fmt_number(columns = starts_with("Factor"), decimals = 3)
```

### Final EFA Model

The resulting model was a two-factor model consisting of 16 items. Factor 1 explained 59% of the variance, and Factor 2 explained 41%. This model had the following model fit indices: $\chi^2$(`r efa.final$dfMODEL`) = `r round(efa.final$chisqMODEL, 2)`, p = `r efa.final$pvalue`; CFI = `r round(efa.final$fit_coefs$CFI, 3)`; TLI = `r round(efa.final$fit_coefs$TLI, 3)`; RMSEA = `r round(efa.final$fit_coefs$RMSEA, 3)`, which indicated acceptable fit.
Due to the low loading in this study, item "q39" is a candidate for removal following the next round of data collection.```{r}left_join( initial$`_data`, final$`_data`,by =c("q"="q" )) %>%arrange(factor(q,levels =c(# Factor 1"q7", "q1", "q6", "q18", "q19", "q21", "q24", "q14", "q13",# Factor 2 "q33", "q42", "q36", "q35", "q38", "q34", "q41"))) %>% janitor::clean_names() %>%select(-q, -item_y) %>%gt() %>%tab_style(locations =list(# Factor 1cells_body(columns = factor_1_x,rows =abs(factor_1_x) < .30),# Factor 2cells_body(columns = factor_2_x,rows =abs(factor_2_x) < .30 ),# Factor 3cells_body(columns = factor_3,rows =abs(factor_3) < .30 ),# Factor 4cells_body(columns = factor_4,rows =abs(factor_4) < .30 ),# Factor 1 - Finalcells_body(columns = factor_1_y,rows =abs(factor_1_y) < .30),# Factor 2 - Finalcells_body(columns = factor_2_y,rows =abs(factor_2_y) < .30 )),style =list(cell_text(color ='gray'))) %>%fmt_number(columns =starts_with("Factor"),decimals =3) %>%fmt_missing() %>%tab_header(title ="Table 8: Initial and Final Factor Structure for 2 factors" ) %>%tab_spanner(label ="Initial 4 Factor Solution",columns =2:5 ) %>%tab_spanner(label ="Final 2 Factor Solution",columns =6:7 ) %>%cols_label(item_x ="Item",factor_1_x ="Factor 1",factor_2_x ="Factor 2",factor_3 ="Factor 3",factor_4 ="Factor 4",factor_1_y ="Factor 1",factor_2_y ="Factor 2" )```The resulting scale consists of the following items:```{r}final.items <- hr %>%select(# Factor 1 q7, q1, q6, q18, q19, q21, q24, q14, q13,# Factor 2 q33, q42, q36, q35, q38, q34, q41)items <- readr::read_csv("references/codebooks/20231025_hr-scale-codebook.csv")items %>%filter(grepl("q", var_label) ) %>%filter(var_label %in%c(# Factor 1"q7", "q1", "q6", "q18", "q19", "q21", "q24", "q14", "q13",# Factor 2 "q33", "q42", "q36", "q35", "q38", "q34", "q41" )) %>%arrange(factor( var_label,levels =c(# Factor 1# Factor 1"q7", "q1", "q6", "q18", "q19", "q21", "q24", "q14", "q13",# Factor 2 "q33", "q42", "q36", "q35", "q38", "q34", "q41" 
  ))) %>%
  mutate(
    item = case_when(
      var_label %in% c("q42", "q35", "q34", "q41", "q39") ~ paste0(item, "*"),
      TRUE ~ item),
    Factor = c(rep("Factor 1", 9), rep("Factor 2", 7))) %>%
  select(Factor, item) %>%
  gt(groupname_col = "Factor") %>%
  tab_header(title = "Table 9: Final Items") %>%
  cols_label(item = "") %>%
  tab_footnote("* indicates reverse coded items")
```

## Scale Inspection

::: columns
::: {.column width="35%"}
The first factor shows good reliability, with alpha, split-half, and greatest lower bound of reliability greater than .93. The second factor has acceptable split-half and internal reliability, but its greatest lower bound is very low, which continues to be a concern. Overall, between the number of items, EFA fit statistics, and reliability, this model provides better fit and output than the initial model.
:::

::: {.column width="10%"}
<!-- Empty -->
:::

::: {.column width="55%"}
```{r}
scale1 <- df %>% select(q7, q1, q6, q24, q14, q18, q19, q13, q21)
scale2 <- df %>% select(q42, q36, q33, q35, q38, q34, q41)

splithalf1 <- splitHalf(scale1); splithalf2 <- splitHalf(scale2)
alpha1 <- psych::alpha(scale1, check.keys = TRUE); alpha2 <- psych::alpha(scale2, check.keys = TRUE)
glb1 <- glb.algebraic(cor(scale1, use = "pairwise.complete.obs")); glb2 <- glb.algebraic(cor(scale2, use = "pairwise.complete.obs"))

data.frame(
  Factor = c("Factor 1", "Factor 1", "Factor 1", "Factor 2", "Factor 2", "Factor 2"),
  `Reliability Statistic` = c("Average Split Half", "Alpha", "Greatest Lower Bound",
                              "Average Split Half", "Alpha", "Greatest Lower Bound"),
  Statistic = c(round(splithalf1$meanr, 2), round(alpha1$total$raw_alpha, 2), round(glb1$glb, 2),
                round(splithalf2$meanr, 2), round(alpha2$total$raw_alpha, 2), round(glb2$glb, 2))) %>%
  gt::gt(groupname_col = "Factor") %>%
  gt::tab_header(title = "Table 10: Reliability Statistics For All Scales")
```
:::
:::

### Item Response Distribution

Both factors show a response distribution similar to that in the initial study.
Some of the more extreme response options are underused, but across all items every response option is used.

```{r}
#| column: screen
#| out-width: 80%
#| layout-ncol: 2
#| fig-cap: 
#| - "Figure 5: Study 2 EFA, Factor 1 Response Distribution"
#| - "Figure 6: Study 2 EFA, Factor 2 Response Distribution"

data.frame(alpha1$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X5)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly disagree",
      name == "X2" ~ "Somewhat disagree",
      name == "X3" ~ "Neither agree nor disagree",
      name == "X4" ~ "Somewhat agree",
      name == "X5" ~ "Strongly agree"
    ),
    levels = c("Strongly disagree", "Somewhat disagree",
               "Neither agree nor disagree",
               "Somewhat agree", "Strongly agree")
  )) %>%
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = paste0(round(value * 100, 0), "%")),
    position = position_stack(vjust = .5), size = 3
  ) +
  coord_flip() +
  ggforce::facet_col(
    facets = vars(question), scales = "free_y", space = "free",
    labeller = label_wrap_gen(width = 100)
  ) +
  labs(title = "Factor 1") +
  theme(
    axis.title = element_blank(),
    strip.background = element_blank(),
    legend.position = "top",
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.background = element_blank(),
    strip.text = element_text(size = 10),
    panel.spacing = unit(-.5, "lines"),
    legend.title = element_blank(),
    strip.clip = "off",
    plot.title = element_text(hjust = 0.5, size = 20)
  )

data.frame(alpha2$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(item %>% select(qname, question), by = c("item" = "qname")) %>%
  pivot_longer(c(X1:X5)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly disagree",
      name == "X2" ~ "Somewhat disagree",
      name == "X3" ~ "Neither agree nor disagree",
      name == "X4" ~ "Somewhat agree",
      name == "X5" ~ "Strongly agree"
    ),
    levels = c("Strongly disagree", "Somewhat disagree",
               "Neither agree nor disagree",
               "Somewhat agree", "Strongly agree")
  )) %>%
  ggplot() +
  aes(x = question, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = paste0(round(value * 100, 0), "%")),
    position = position_stack(vjust = .5), size = 3
  ) +
  coord_flip() +
  ggforce::facet_col(
    facets = vars(question), scales = "free_y", space = "free",
    labeller = label_wrap_gen(width = 100)
  ) +
  labs(title = "Factor 2") +
  theme(
    axis.title = element_blank(),
    strip.background = element_blank(),
    legend.position = "top",
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.background = element_blank(),
    strip.text = element_text(size = 10),
    panel.spacing = unit(-.5, "lines"),
    legend.title = element_blank(),
    strip.clip = "off",
    plot.title = element_text(hjust = 0.5, size = 20)
  )
```

# Study 3

Study 3 was preregistered; the preregistration can be accessed at <https://osf.io/q4uzv>. In this study, we conducted a confirmatory factor analysis on new data, based on the factor structure identified in Study 2. The confirmatory factor analysis was fit using weighted least squares estimation, treating response categories as ordinal rather than numeric, which was a deviation from the preregistration. We use Hu & Bentler's (1999) cutoffs of .95 for TLI and CFI, and .06 for RMSEA.

## Participants

Through Prolific, we recruited 402 participants for Study 3. These participants were 52% men (n = 205) and 70% White (n = 277); most had completed a bachelor's degree (41%, n = 161), and 88% reported no history of SUD diagnosis. Table 11 provides detail about the participants.
```{r}
df <- read.csv("data/clean/20231025_hr-scale-confirmatory-data2.csv")

demos <- df %>% select(age, gid1, race, ed, sud_hx)

demos %>%
  tbl_summary(
    missing = "no",
    statistic = list(age = "{mean} ({sd})"),
    digits = list(age ~ c(2, 2)),
    sort = list(everything() ~ "frequency"),
    label = list(
      age ~ "Age",
      gid1 ~ "Gender Identity",
      race ~ "Race/Ethnicity",
      ed ~ "Highest Level of Education",
      sud_hx ~ "History of Substance Use Disorder Diagnosis"
    )
  ) %>%
  as_gt() %>%
  tab_header(title = "Table 11: Study 3 Participants")
```

## Confirmatory Factor Analysis

The model developed in Study 2 was fit to Study 3's data using confirmatory factor analysis with weighted least squares estimation and ordinal indicators. The model showed acceptable fit on most indicators: $\chi^2$ (103) = 314.68, p < .001; CFI = .98; TLI = .97; RMSEA = .07; SRMR = .19. Figure 7 provides factor loadings and covariances for the model.

```{r}
#| column: margin

# Import data
df <- read.csv("data/clean/20231025_hr-scale-confirmatory-data2.csv")

# Select items
hr <- df %>% # N = 301
  select(starts_with("q"), -Q_RecaptchaScore)

# Study 2 model
model <- '
  F1 =~ q7 + q1 + q6 + q18 + q19 + q21 + q24 + q14 + q13
  F2 =~ q33 + q42 + q36 + q35 + q38 + q34 + q41
  F1 ~~ F2
'

fit <- cfa(
  model,
  data = df,
  ordered = TRUE,
  estimator = "WLS",
  std.lv = TRUE
)

fitmeasures(fit, c("cfi", "tli", "srmr", "rmsea", "chisq", "df", "pvalue")) %>%
  as_tibble(rownames = "Item") %>%
  gt() %>%
  tab_header(title = "Table 12: Study 3 CFA Fit Statistics") %>%
  fmt_number(columns = 2, decimals = 2)
```

```{r}
#| column: page
#| fig-cap: "Figure 7: Study 3 CFA"

plot(set_cfa_layout(
  semPaths(
    fit,
    whatLabels = "est",
    sizeMan = 3,
    node.width = 1,
    thresholds = FALSE,
    edge.label.cex = .75,
    style = "ram",
    mar = c(5, 1, 5, 1),
    intercepts = FALSE,
    DoNotPlot = TRUE
  ),
  fcov_curve = 1.75,
  loading_position = .9
))
```

## Scale Inspection

```{r}
meanscores <- df %>%
  mutate(
    across(c(q42, q35, q34, q41), ~ 6 - .),
    Strategies = rowMeans(df %>% select(all_of(colnames(scale1))), na.rm = TRUE),
    Principles = rowMeans(df %>% select(all_of(colnames(scale2))), na.rm = TRUE)
  )
```

```{r}
#| column: margin

scale1 <- df %>% select(q7, q1, q6, q24, q14, q18, q19, q13, q21)
scale2 <- df %>% select(q42, q36, q33, q35, q38, q34, q41)

splithalf1 <- splitHalf(scale1)
splithalf2 <- splitHalf(scale2)
alpha1 <- psych::alpha(scale1, check.keys = TRUE)
alpha2 <- psych::alpha(scale2, check.keys = TRUE)
glb1 <- glb.algebraic(cor(scale1, use = "pairwise.complete.obs"))
glb2 <- glb.algebraic(cor(scale2, use = "pairwise.complete.obs"))

data.frame(
  Factor = c("Factor 1", "Factor 1", "Factor 1",
             "Factor 2", "Factor 2", "Factor 2"),
  `Reliability Statistic` = c(
    "Average Split Half", "Alpha", "Greatest Lower Bound",
    "Average Split Half", "Alpha", "Greatest Lower Bound"
  ),
  Statistic = c(
    round(splithalf1$meanr, 2), round(alpha1$total$raw_alpha, 2), round(glb1$glb, 2),
    round(splithalf2$meanr, 2), round(alpha2$total$raw_alpha, 2), round(glb2$glb, 2)
  )
) %>%
  gt::gt(groupname_col = "Factor") %>%
  gt::tab_header(title = "Table 13: Reliability Statistics for All Scales")
```

In Study 3, the first factor again shows good reliability, with alpha, average split-half, and greatest lower bound estimates greater than .93. The second factor has acceptable split-half and internal consistency reliability, but its greatest lower bound remains very low, which continues to be a concern.

Table 14 provides the mean, median, and standard deviation for both subscales. The Strategies subscale may show some ceiling effect, as scores cluster toward the top of the scale, while the Principles subscale shows a roughly normal but slightly skewed distribution. The two subscales show little to no relationship (r = -0.07, 95% CI [-0.16, 0.03]).

```{r}
gtExtras::gt_plt_summary(
  meanscores[60:61],
  title = "Table 14: Mean Score for Both Factors"
)
```

### Item Response Distribution

Both factors show a response distribution similar to the initial study.
Some of the more extreme response options are underused, but across all items every response option is used.

```{r}
#| column: screen
#| out-width: 80%
#| layout-ncol: 2
#| fig-cap: 
#| - "Figure 8: Study 3, Strategies Subscale Response Distribution"
#| - "Figure 9: Study 3, Principles Subscale Response Distribution"

(items1 <- data.frame(alpha1$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(items %>% select(var_label, item), by = c("item" = "var_label")) %>%
  pivot_longer(c(X1:X5)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly disagree",
      name == "X2" ~ "Somewhat disagree",
      name == "X3" ~ "Neither agree nor disagree",
      name == "X4" ~ "Somewhat agree",
      name == "X5" ~ "Strongly agree"
    ),
    levels = c("Strongly agree", "Somewhat agree",
               "Neither agree nor disagree",
               "Somewhat disagree", "Strongly disagree")
  )) %>%
  ggplot() +
  aes(x = item.y, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = paste0(round(value * 100, 0), "%")),
    position = position_stack(vjust = .5), size = 3
  ) +
  ggokabeito::scale_fill_okabe_ito(breaks = c(
    "Strongly disagree", "Somewhat disagree",
    "Neither agree nor disagree",
    "Somewhat agree", "Strongly agree"
  )) +
  coord_flip() +
  ggforce::facet_col(
    facets = vars(item.y), scales = "free_y", space = "free",
    labeller = label_wrap_gen(width = 100)
  ) +
  labs(title = "Harm Reduction Strategies") +
  theme(
    axis.title = element_blank(),
    strip.background = element_blank(),
    legend.position = "top",
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.background = element_blank(),
    strip.text = element_text(size = 16),
    panel.spacing = unit(-.5, "lines"),
    legend.title = element_blank(),
    strip.clip = "off",
    plot.title = element_text(hjust = 0.5, size = 20)
  ))

(items2 <- data.frame(alpha2$response.freq) %>%
  mutate(item = rownames(.)) %>%
  left_join(items %>% select(var_label, item), by = c("item" = "var_label")) %>%
  pivot_longer(c(X1:X5)) %>%
  mutate(Response = factor(
    case_when(
      name == "X1" ~ "Strongly disagree",
      name == "X2" ~ "Somewhat disagree",
      name == "X3" ~ "Neither agree nor disagree",
      name == "X4" ~ "Somewhat agree",
      name == "X5" ~ "Strongly agree"
    ),
    levels = c("Strongly agree", "Somewhat agree",
               "Neither agree nor disagree",
               "Somewhat disagree", "Strongly disagree")
  )) %>%
  ggplot() +
  aes(x = item.y, y = value, fill = Response) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = paste0(round(value * 100, 0), "%")),
    position = position_stack(vjust = .5), size = 3
  ) +
  ggokabeito::scale_fill_okabe_ito(breaks = c(
    "Strongly disagree", "Somewhat disagree",
    "Neither agree nor disagree",
    "Somewhat agree", "Strongly agree"
  )) +
  coord_flip() +
  ggforce::facet_col(
    facets = vars(item.y), scales = "free_y", space = "free",
    labeller = label_wrap_gen(width = 100)
  ) +
  labs(title = "Harm Reduction Principles") +
  theme(
    axis.title = element_blank(),
    strip.background = element_blank(),
    legend.position = "top",
    axis.ticks = element_blank(),
    axis.text = element_blank(),
    panel.background = element_blank(),
    strip.text = element_text(size = 16),
    panel.spacing = unit(-.5, "lines"),
    legend.title = element_blank(),
    strip.clip = "off",
    plot.title = element_text(hjust = 0.5, size = 20)
  ))
```

Finally, Figure 10 shows the mean score distributions for both subscales in the current sample. As noted above, the Strategies subscale may show some ceiling effect, while the Principles subscale does not. While these two subscales need additional evidence, this factor structure for these 16 items demonstrates acceptable reliability and structural validity.
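As a check on the subscale association reported earlier, the correlation and its 95% confidence interval can be reproduced directly from the mean scores. A minimal sketch, assuming the `meanscores` data frame computed above (in which the reverse-coded items have already been recoded):

```{r}
# Pearson correlation between the two subscale mean scores,
# with a 95% confidence interval
cor.test(meanscores$Strategies, meanscores$Principles)
```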
```{r}
#| fig-cap: "Figure 10: Mean Score Distribution"

(scores <- meanscores %>%
  pivot_longer(Strategies:Principles) %>%
  mutate(sud_hx = case_when(
    sud_hx == "I don’t know" ~ "No",
    sud_hx == "Prefer not to say" ~ "No",
    TRUE ~ sud_hx
  )) %>%
  group_by(
    name
    # race,
    # sud_hx,
    # gid1
  ) %>%
  mutate(avg = mean(value, na.rm = TRUE)) %>%
  ungroup() %>%
  ggplot() +
  aes(x = name, y = value, fill = name) +
  geom_violin() +
  geom_jitter(alpha = .3, height = 0.1) +
  geom_hline(aes(yintercept = avg, col = name), size = 2, alpha = .5) +
  labs(title = "Harm Reduction Subscale Scores") +
  ggokabeito::scale_fill_okabe_ito() +
  ggokabeito::scale_color_okabe_ito() +
  cowplot::theme_half_open() +
  theme(
    axis.title = element_blank(),
    legend.position = "none",
    axis.text.x = element_blank(),
    axis.line.y = element_line(),
    axis.line.y.right = element_line(),
    panel.background = element_blank(),
    plot.title = element_text(hjust = 0.5, size = 25),
    strip.clip = "off"
  ) +
  facet_wrap(~ name, scales = "free_x", strip.position = "bottom"))
```